Deep Dependencies from Context-Free Statistical Parsers: Correcting the Surface Dependency Approximation
نویسندگان
چکیده
We present a linguistically-motivated algorithm for reconstructing nonlocal dependency in broad-coverage context-free parse trees derived from treebanks. We use an algorithm based on loglinear classifiers to augment and reshape context-free trees so as to reintroduce underlying nonlocal dependencies lost in the context-free approximation. We find that our algorithm compares favorably with prior work on English using an existing evaluation metric, and also introduce and argue for a new dependency-based evaluation metric. By this new evaluation metric our algorithm achieves 60% error reduction on gold-standard input trees and 5% error reduction on state-ofthe-art machine-parsed input trees, when compared with the best previous work. We also present the first results on nonlocal dependency reconstruction for a language other than English, comparing performance on English and German. Our new evaluation metric quantitatively corroborates the intuition that in a language with freer word order, the surface dependencies in context-free parse trees are a poorer approximation to underlying dependency structure.
منابع مشابه
Wide-Coverage Deep Statistical Parsing Using Automatic Dependency Structure Annotation
A number of researchers (Lin 1995; Carroll, Briscoe, and Sanfilippo 1998; Carroll et al. 2002; Clark and Hockenmaier 2002; King et al. 2003; Preiss 2003; Kaplan et al. 2004;Miyao and Tsujii 2004) have convincingly argued for the use of dependency (rather than CFG-tree) representations for parser evaluation. Preiss (2003) and Kaplan et al. (2004) conducted a number of experiments comparing “deep...
متن کاملOn Different Approaches to Syntactic Analysis Into Bi-Lexical Dependencies An Empirical Comparison of Direct, PCFG-Based, and HPSG-Based Parsers
We compare three different approaches to parsing into syntactic, bi-lexical dependencies for English: a ‘direct’ data-driven dependency parser, a statistical phrase structure parser, and a hybrid, ‘deep’ grammar-driven parser. The analyses from the latter two are post-converted to bilexical dependencies. Through this ‘reduction’ of all three approaches to syntactic dependency parsers, we determ...
متن کاملFrom Surface Dependencies towards Deeper Semantic Representations
In the past, a divide could be seen between ’deep’ parsers on the one hand, which construct a semantic representation out of their input, but usually have significant coverage problems, and more robust parsers on the other hand, which are usually based on a (statistical) model derived from a treebank and have larger coverage, but leave the problem of semantic interpretation to the user. More re...
متن کاملA Statistical Constraint Dependency Grammar (CDG) Parser
CDG represents a sentence’s grammatical structure as assignments of dependency relations to functional variables associated with each word in the sentence. In this paper, we describe a statistical CDG (SCDG) parser that performs parsing incrementally and evaluate it on the Wall Street Journal Penn Treebank. Using a tight integration of multiple knowledge sources, together with distance modeling...
متن کاملEvaluation of Dependency Parsers on Unbounded Dependencies
We evaluate two dependency parsers, MSTParser and MaltParser, with respect to their capacity to recover unbounded dependencies in English, a type of evaluation that has been applied to grammarbased parsers and statistical phrase structure parsers but not to dependency parsers. The evaluation shows that when combined with simple post-processing heuristics, the parsers correctly recall unbounded ...
متن کامل